# Toxicity Detection
- **Toxic Prompt Roberta** (Intel) · MIT. A RoBERTa-based text classification model for detecting toxic prompts and responses in dialogue systems. Tags: Text Classification, Transformers. 416 downloads, 7 likes.
- **Gptfuzz** (hubert233) · MIT. A RoBERTa fine-tuned classification model for evaluating the toxicity level of responses. Tags: Text Classification, Transformers. 2,578 downloads, 12 likes.
- **Toxicitymodel** (nicholasKluge) · Apache-2.0. ToxicityModel is a fine-tuned model based on RoBERTa, designed to assess the toxicity level of English sentences. Tags: Text Classification, Transformers, English. 133.56k downloads, 12 likes.
- **Reward Model Deberta V3 Large V2** (OpenAssistant) · MIT. This reward model is trained to predict which generated answer humans would prefer for a given question. Suitable for QA evaluation, RLHF reward scoring, and toxic answer detection. Tags: Large Language Model, Transformers, English. 11.15k downloads, 219 likes.
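All four models are served through the Transformers `text-classification` task, so they share one loading pattern. Below is a minimal sketch using the Hugging Face `pipeline` API; the model id `Intel/toxic-prompt-roberta` is an assumption matching the "Toxic Prompt Roberta" entry above (verify the exact id on the hub before use), and the label names and score threshold are illustrative, not taken from any specific model card.

```python
def is_toxic(scores, threshold=0.5):
    """Return True if any label other than 'non-toxic' scores above threshold.

    `scores` is a list of {"label": str, "score": float} dicts, the shape
    returned by the transformers text-classification pipeline.
    The "non-toxic" label name is an assumption; check the model's config.
    """
    return any(
        s["score"] >= threshold and s["label"].lower() != "non-toxic"
        for s in scores
    )


def classify(text, model_id="Intel/toxic-prompt-roberta"):
    """Score one string with a hub toxicity classifier (downloads the model)."""
    # Imported lazily so is_toxic() stays usable without transformers installed.
    from transformers import pipeline

    clf = pipeline("text-classification", model=model_id, top_k=None)
    # top_k=None returns scores for every label, not just the argmax.
    return clf(text)


if __name__ == "__main__":
    print(classify("You are a wonderful person."))
```

The same sketch applies to the other entries by swapping the model id; only the label vocabulary (and hence the check inside `is_toxic`) differs per model.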